Relation Extraction for Ontology Construction

Author

  • Kate Byrne
Abstract

This proposal is for a programme of work leading to the building of a data querying application. The starting point is a collection of relational databases holding cultural heritage material from the National Collections of Scotland. The data is a mixture of fixed fields and free text, supported by background material such as domain thesauri. The goal is to produce a system for running queries against this material that does not assume the user has expert knowledge of the data structure or the specialist domain terminology. Two core tasks are proposed as necessary steps towards the goal:

  • Extraction of two-place relations from free text: The purpose of this step is to translate key facts from the textual material into a standardised format. A combination of rule-based and machine learning approaches is planned.
  • Automatic assembly of all relevant data into an ontology: An ontology is defined here as a graph of two-place relations where the edges represent predicates and the nodes the entities they apply to. All of the relevant information (from database fields, domain thesauri and the extracted textual relations) will be combined into such a graph.

The application will run against the populated ontology. An interactive interface is proposed, in which a tailored summary based on the user’s initial query is generated and the user is then invited to refine the query based on this information. Evaluation of each subtask is planned, and the overall criteria for success will be that query performance is comparable to what is currently available in terms of speed and range, whilst also producing improved results for the non-expert user.
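The ontology definition in the abstract (a graph whose edges are predicates and whose nodes are the entities they apply to) could be sketched roughly as follows. This is an illustrative sketch only, not the proposal's implementation: the class names, the `source` provenance field, and all sample identifiers and values are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    """One two-place relation: predicate(subject, obj)."""
    subject: str    # node the predicate applies to
    predicate: str  # edge label
    obj: str        # node or literal value at the other end of the edge
    source: str     # assumed provenance tag: "database", "thesaurus" or "text"


class OntologyGraph:
    """Directed, labelled graph stored as a set of two-place relations."""

    def __init__(self) -> None:
        self._by_subject: dict[str, set[Relation]] = defaultdict(set)

    def add(self, subject: str, predicate: str, obj: str, source: str) -> None:
        self._by_subject[subject].add(Relation(subject, predicate, obj, source))

    def outgoing(self, subject: str, predicate: str | None = None) -> list[Relation]:
        """Return the edges leaving `subject`, optionally filtered by predicate."""
        rels = self._by_subject.get(subject, set())
        return sorted(r for r in rels if predicate is None or r.predicate == predicate)


# Hypothetical sample data combining the three kinds of input named above.
graph = OntologyGraph()
graph.add("site:1", "has_name", "Example Broch", "database")   # fixed field
graph.add("broch", "is_a", "monument type", "thesaurus")       # domain thesaurus
graph.add("site:1", "has_type", "broch", "text")               # extracted relation

for rel in graph.outgoing("site:1"):
    print(rel.predicate, "->", rel.obj, f"({rel.source})")
```

Keeping a provenance tag on every edge is one possible design choice: it would let a query interface distinguish facts taken from database fields from those extracted automatically from free text.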


Similar resources

Semi-automatic Domain Ontology Construction from Spoken Corpus in Tunisian Dialect: Railway Request Information

In this paper, we present a hybrid method for semi-automatic building of domain ontology from spoken dialogue corpus in Tunisian Dialect for the railway request information domain. The proposed method is based on a statistical method for term and concept extraction and a linguistic method for semantic relation extraction. This method consists of three fundamental phases, namely the corpus const...


Automatic Thai Ontology Construction and Maintenance System

Ontology is an essential resource for enhancing the performance of information processing systems, in tasks such as information integration, document classification in taxonomies, information retrieval and data cleaning in database systems. This paper proposes three methodologies for Automatic Thai Ontology Construction and Maintenance from technical corpus, dictionary and thesaurus. For corpus base...


Automatic Wayang Ontology Construction using Relation Extraction from Free Text

This paper reports on our work to automatically construct and populate an ontology of wayang (Indonesian shadow puppet) mythology from free text using relation extraction and relation clustering. A reference ontology is used to evaluate the generated ontology. The reference ontology contains concepts and properties within the wayang character domain. We examined the influence of corpus data var...


Relation Extraction Based on AGROVOC

Relation extraction is an important step in ontology construction. This paper provides a method to extract relations among concepts using AGROVOC.


Taxonomic Relation Extraction from Wikipedia: Datasets and Algorithms

The dynamic and continuously growing category structure of Wikipedia has been used in numerous ontology extraction methods. We present a dataset of category subgraphs automatically extracted from Wikipedia that are manually annotated for is-a and instance-of relations in order to enable a more comprehensive evaluation of taxonomy mining approaches. We also show how the new dataset can be used w...


Creation of a bottom-up corpus-based ontology for Italian Linguistics

This paper describes the steps of construction of a shallow lexical ontology of Italian Linguistics in Italian, set to be used by a metasearch engine for query refinement. The ontology was constructed with the software Protégé 4.0.2 and encoded in OWL format; its construction has been carried out following the steps described in the well-known Ontology Learning From Text (OLFT) layer cake. The ...



Publication date: 2006